Pandas Replace Value in a Dataframe

Author: Aditya Raj Last Updated: January 4, 2023

Pandas dataframes are used to manipulate tabular data in Python. Sometimes, while manipulating the data, we need to replace certain values in the pandas dataframe. In this article, we will discuss different ways to replace a value in a pandas dataframe. 

Table of Contents
- The replace() Method
- Replace Value in a Series in Python
- Pandas Replace Single Value in the Entire Dataframe
- Replace Value in a Single Column in a Dataframe
- Pandas Replace Different Value in Each Column
- Replace Value Inplace in a Pandas Dataframe
- Conclusion

The replace() Method

To replace one or more values in a pandas dataframe, you can use the replace() method. It has the following syntax.

DataFrame.replace(to_replace=None, value=_NoDefault.no_default, *, inplace=False, limit=None, regex=False, method=_NoDefault.no_default)

Here,

- The to_replace parameter takes a string, regex, list, dictionary, series, integer, or floating-point number as its input argument.
  - If the input given to the to_replace parameter is a string, integer, floating-point number, or regex, the values matching the input are replaced by the input given to the value parameter.
  - If we pass a list of strings, numeric values, or regexes to the to_replace parameter, it works in two ways. If the input given to the value parameter is a single value, all the elements of the list are replaced by that value. If the input given to the value parameter is a list, both lists must have equal length, and each value in the to_replace list is replaced by the value at the corresponding position in the value list.
  - If the input given to the to_replace parameter is a python dictionary, it also works in two ways. If the value parameter is not given, the keys of the dictionary are replaced with their associated values. If the value parameter is given, the keys of the dictionary should be column names, and the associated values are the values to be replaced in those columns with the input given to the value parameter.
- The inplace parameter decides whether the original object is modified. By default, the replace() method returns a new dataframe. If you want to modify the original dataframe, you can set the inplace parameter to True.
- The method parameter applies when to_replace is a scalar, list, or tuple and the value parameter is set to None. In this case, the replace() method works like the pandas fillna() method: the values given to the to_replace parameter are first replaced with NaN, and the NaN values are then filled using the method specified in the method parameter. You can specify 'pad' or 'ffill' for forward fill and 'bfill' for backward fill. This describes the pandas version whose signature is shown above; pandas 2.1 and later deprecate the method and limit parameters.
- The limit parameter sets the maximum size of the gap to forward or backward fill when the replace() method works like the fillna() method.
- The regex parameter is used to specify whether to interpret to_replace and/or value as regular expressions. If this is True, then to_replace must be a string. Alternatively, regex itself can be a regular expression or a list, dict, or array of regular expressions, in which case to_replace must be None.

After execution, the replace() method returns a new dataframe if the inplace parameter is set to False. Otherwise, it returns None. If invoked on a pandas series, the replace() method returns a series.
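The list, dictionary, and regex forms described above can be sketched as follows. This is a minimal illustration, not part of the original article: the dataframe and the names grade, city, same_value, pairwise, mapped, and by_regex are made up for the example.

```python
import pandas as pd

# Hypothetical data, used only to illustrate the parameter forms.
df = pd.DataFrame({
    "grade": ["A", "B", "C", "B"],
    "city": ["Pune", "Agra", "Pune", "Goa"],
})

# List + single value: every listed value becomes the same replacement.
same_value = df.replace(["A", "B"], "Pass")

# List + list of equal length: values are replaced position by position.
pairwise = df.replace(["A", "B", "C"], ["Excellent", "Good", "Average"])

# Dictionary without a value argument: each key is replaced by its value.
mapped = df.replace({"Pune": "PNQ", "Goa": "GOI"})

# regex=True: to_replace is interpreted as a regular expression.
# The pattern matches whole strings that start with "P".
by_regex = df.replace(r"^P.*", "P-city", regex=True)

print(same_value)
print(pairwise)
print(mapped)
print(by_regex)
```

Each call returns a new dataframe, so df itself still holds the original values after all four calls.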

Replace Value in a Series in Python

To replace a value in a series, we will pass the value to be replaced and the new value to the replace() method as shown in the following example.

import pandas as pd
numbers=[3,23,100,14,16,100,45,65]
series=pd.Series(numbers)
print("The series is:")
print(series)
# Replace every occurrence of 100 with the string "Max"
newSeries=series.replace(100,"Max")
print("The updated series is:")
print(newSeries)

Output:

The series is:
0      3
1     23
2    100
3     14
4     16
5    100
6     45
7     65
dtype: int64
The updated series is:
0      3
1     23
2    Max
3     14
4     16
5    Max
6     45
7     65
dtype: object

In this example, we first created a series using a python list. Then, we invoked the replace() method on the series with 100 as its first input argument and the string "Max" as the second input argument. After execution, the replace() method replaces each instance of 100 with "Max" and returns a new series. Because the result mixes integers and a string, its data type changes from int64 to object.
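If several values in a series need different replacements, a dictionary can be passed instead of a single pair of arguments. The following is a small sketch under that assumption; the "Min" label and the choice of 3 as the value to relabel are made up for illustration.

```python
import pandas as pd

series = pd.Series([3, 23, 100, 14, 16, 100, 45, 65])

# Dictionary form: each key is replaced by its associated value.
# The labels "Max" and "Min" are illustrative choices.
new_series = series.replace({100: "Max", 3: "Min"})

print(new_series)
```

The original series is left untouched; only new_series contains the labels.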

Pandas Replace Single Value in the Entire Dataframe

To replace a value in a pandas dataframe, we will invoke the replace() method on the dataframe. Here, we will pass the value that needs to be replaced as the first input argument and the new value as the second input argument to the replace() method as shown below.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":87, "Chemistry": 82},
         {"Roll":2,"Maths":75, "Physics":100, "Chemistry": 90},
         {"Roll":3,"Maths":87, "Physics":84, "Chemistry": 76},
         {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
         {"Roll":5,"Maths":90, "Physics":87, "Chemistry": 84},
         {"Roll":6,"Maths":79, "Physics":75, "Chemistry": 72}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
newDf=df.replace(100,"Max")
print("The updated dataframe is:")
print(newDf)

Output:

The input dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    100       87         82
1     2     75      100         90
2     3     87       84         76
3     4    100      100         90
4     5     90       87         84
5     6     79       75         72
The updated dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    Max       87         82
1     2     75      Max         90
2     3     87       84         76
3     4    Max      Max         90
4     5     90       87         84
5     6     79       75         72

In the above example, we first converted a list of dictionaries to a dataframe. Then, we invoked the replace() method on the dataframe with 100 as its first input argument and "Max" as the second input argument. After execution, the replace() method returns a new dataframe in which each instance of 100 is replaced with "Max". The original dataframe is left unchanged.
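One way to confirm that the original dataframe keeps its values, and to see the side effect on column data types, is sketched below. The shortened three-row dataframe is an assumption made to keep the example small.

```python
import pandas as pd

# A shortened version of the marks table, for illustration only.
df = pd.DataFrame({
    "Roll": [1, 2, 3],
    "Maths": [100, 75, 87],
    "Physics": [87, 100, 84],
})

new_df = df.replace(100, "Max")

# The original still contains the integer 100 in two cells.
cells_with_100 = int((df == 100).sum().sum())

# The new dataframe contains "Max" in the same two cells.
cells_with_max = int((new_df == "Max").sum().sum())

# Columns that received a string no longer hold only integers.
print(df.dtypes)
print(new_df.dtypes)
print(cells_with_100, cells_with_max)
```

Because the affected columns now have the object data type, numeric operations such as computing a mean on them would need the labels to be handled first.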

Replace Value in a Single Column in a Dataframe

Instead of replacing a value in the entire dataframe, you can also replace a value in a single column of a pandas dataframe.

To replace a value in a specific column, we will invoke the replace() method on the column instead of the entire dataframe. You can observe this in the following example.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":87, "Chemistry": 82},
         {"Roll":2,"Maths":75, "Physics":100, "Chemistry": 90},
         {"Roll":3,"Maths":87, "Physics":84, "Chemistry": 76},
         {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
         {"Roll":5,"Maths":90, "Physics":87, "Chemistry": 84},
         {"Roll":6,"Maths":79, "Physics":75, "Chemistry": 72}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
df["Maths"]=df["Maths"].replace(100,"Max")
print("The updated dataframe is:")
print(df)

Output:

The input dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    100       87         82
1     2     75      100         90
2     3     87       84         76
3     4    100      100         90
4     5     90       87         84
5     6     79       75         72
The updated dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    Max       87         82
1     2     75      100         90
2     3     87       84         76
3     4    Max      100         90
4     5     90       87         84
5     6     79       75         72

In the above example, we have invoked the replace() method on a column of the dataframe. After execution, the replace() method returns a new series object. We then assign the returned series back to the existing column in the dataframe.
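An alternative that avoids the column assignment is to call replace() on the dataframe with a nested dictionary, where the outer key is the column name and the inner dictionary maps old values to new ones. This is a sketch of that variant, not the approach used in the example above; the shortened dataframe is an assumption for brevity.

```python
import pandas as pd

# A shortened version of the marks table, for illustration only.
df = pd.DataFrame({
    "Roll": [1, 2, 3],
    "Maths": [100, 75, 87],
    "Physics": [87, 100, 84],
})

# Nested dictionary: {column name: {value to replace: new value}}.
# Only the "Maths" column is searched for 100.
new_df = df.replace({"Maths": {100: "Max"}})

print(new_df)
```

Here the 100 in the "Physics" column stays as it is, because the nested dictionary restricts the replacement to "Maths".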

Pandas Replace Different Value in Each Column

If you want to replace different values in different columns with a single final value, you can pass a dictionary to the replace() method as the first input argument.

Here, the dictionary should contain the column names as its keys and the values that need to be replaced in the columns as the corresponding values of the keys in the dictionary. You can specify the replacement value as the second input argument to the replace() method. After execution, you will get the desired output as shown below.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":87, "Chemistry": 82},
         {"Roll":2,"Maths":75, "Physics":100, "Chemistry": 90},
         {"Roll":3,"Maths":87, "Physics":84, "Chemistry": 76},
         {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
         {"Roll":5,"Maths":90, "Physics":87, "Chemistry": 84},
         {"Roll":6,"Maths":79, "Physics":75, "Chemistry": 72}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
newDf=df.replace({"Maths":100,"Physics":100, "Chemistry":90},"Max")
print("The updated dataframe is:")
print(newDf)

Output:

The input dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    100       87         82
1     2     75      100         90
2     3     87       84         76
3     4    100      100         90
4     5     90       87         84
5     6     79       75         72
The updated dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    Max       87         82
1     2     75      Max        Max
2     3     87       84         76
3     4    Max      Max        Max
4     5     90       87         84
5     6     79       75         72

In the original dataframe, the column "Chemistry" has 90 as its highest value. So, replacing only 100 with "Max" would not mark the rows that have the maximum marks in Chemistry.

To specify the value to replace in each column, we passed a python dictionary to the replace() method as its first input argument, with the column names as the keys and the maximum value in each column as the associated values, and the string "Max" as the second input argument. Hence, the replace() method replaces the value 100 in the columns "Maths" and "Physics". In the column "Chemistry", it replaces the value 90 with "Max", as specified in the dictionary.
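Rather than typing the maximum of each column by hand, the dictionary can be built from the data. The sketch below assumes the subject columns are known in advance; the names subjects and max_per_column are introduced here for illustration and do not appear in the example above.

```python
import pandas as pd

# A shortened version of the marks table, for illustration only.
df = pd.DataFrame({
    "Roll": [1, 2, 3, 4],
    "Maths": [100, 75, 87, 100],
    "Physics": [87, 100, 84, 100],
    "Chemistry": [82, 90, 76, 90],
})

# Columns whose maximum should be labelled (an assumption of this sketch).
subjects = ["Maths", "Physics", "Chemistry"]

# Build {column name: maximum value in that column}.
max_per_column = df[subjects].max().to_dict()

# Replace each column's maximum with the same label.
new_df = df.replace(max_per_column, "Max")

print(max_per_column)
print(new_df)
```

The "Roll" column is left out of subjects on purpose, so its largest value is not relabelled.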

Replace Value Inplace in a Pandas Dataframe

In the above examples, the replace() method returns a new dataframe or series after execution. If you want to modify the existing series or dataframe after using the replace() method, you can set the inplace parameter to True. After this, the original series or dataframe will be modified. You can observe this in the following example.

import pandas as pd
numbers=[3,23,100,14,16,100,45,65]
series=pd.Series(numbers)
print("The series is:")
print(series)
# inplace=True modifies the existing series instead of creating a new one
series.replace(100,"Max",inplace=True)
print("The updated series is:")
print(series)

Output:

The series is:
0      3
1     23
2    100
3     14
4     16
5    100
6     45
7     65
dtype: int64
The updated series is:
0      3
1     23
2    Max
3     14
4     16
5    Max
6     45
7     65
dtype: object

In this example, we have set the inplace parameter to True in the replace() method. Hence, the replace() method modifies the original series instead of returning a new series.

In a similar manner, you can replace a value in a pandas dataframe in place as shown in the following example.

import pandas as pd
myDicts=[{"Roll":1,"Maths":100, "Physics":87, "Chemistry": 82},
         {"Roll":2,"Maths":75, "Physics":100, "Chemistry": 90},
         {"Roll":3,"Maths":87, "Physics":84, "Chemistry": 76},
         {"Roll":4,"Maths":100, "Physics":100, "Chemistry": 90},
         {"Roll":5,"Maths":90, "Physics":87, "Chemistry": 84},
         {"Roll":6,"Maths":79, "Physics":75, "Chemistry": 72}]
df=pd.DataFrame(myDicts)
print("The input dataframe is:")
print(df)
df.replace({"Maths":100,"Physics":100, "Chemistry":90},"Max",inplace=True)
print("The updated dataframe is:")
print(df)

Output:

The input dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    100       87         82
1     2     75      100         90
2     3     87       84         76
3     4    100      100         90
4     5     90       87         84
5     6     79       75         72
The updated dataframe is:
   Roll  Maths  Physics  Chemistry
0     1    Max       87         82
1     2     75      Max        Max
2     3     87       84         76
3     4    Max      Max        Max
4     5     90       87         84
5     6     79       75         72

Conclusion

In this article, we have discussed different ways to replace a value in a pandas dataframe and series. We also discussed how to replace different values in different columns by a single value.

To learn more about python programming, you can read this article on how to sort a pandas dataframe. You might also like this article on how to drop columns from a pandas dataframe.

I hope you enjoyed reading this article. Stay tuned for more informative articles.

Happy Learning!
